Finding multivariate outliers in fMRI time-series data

Authors

  • John F. Magnotti
  • Nedret Billor
Abstract

A fundamental challenge for researchers studying the brain is to explain how distributed patterns of brain activity relate to a specific representation or computation. Multivariate techniques are therefore becoming increasingly popular for pattern localization in functional magnetic resonance imaging (fMRI) data. The increased power of these techniques can be offset by their susceptibility to multivariate outliers, a problem not directly encountered when fMRI data are analyzed with more common univariate techniques. We test how two algorithms, High Dimensional Blocked Adaptive Computationally Efficient Outlier Nominators (HD BACON) and Principal Component based Outlier detection (PCOut), can detect multivariate outliers in high-dimensional fMRI data, in which the number of variables is larger than the number of observations. We show how these methods can be applied to individual voxel time-series to identify outlying voxels within a region of interest. Finally, we compare these methods on simulated data to identify which aspects of the data each method is most sensitive to. Voxels identified by both algorithms were primarily on the edges of univariate activation clusters and near the boundaries between different tissue types. Simulation results showed that PCOut outperformed HD BACON, maintaining both high sensitivity and specificity across a wide range of outlier contamination percentages. Our results suggest that multivariate analysis of fMRI can benefit from including multivariate outlier detection as a routine data quality check prior to model fitting.
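To make the setting concrete, the sketch below flags outlying rows in a matrix with more variables than observations, treating each row as one voxel time-series. It is not the published PCOut or HD BACON algorithm: it is a simplified stand-in that scales robustly with the median and MAD, projects onto principal components, and thresholds a robust z-score of the resulting distances. The function names, the 99% variance cut-off, the 3.5 threshold, and the simulated data are all assumptions made for illustration.

```python
import numpy as np


def mad(x, axis=0):
    """Median absolute deviation, scaled to match the SD under normality."""
    med = np.median(x, axis=axis, keepdims=True)
    return 1.4826 * np.median(np.abs(x - med), axis=axis, keepdims=True)


def pca_outlier_flags(X, var_explained=0.99, cutoff=3.5):
    """Flag outlying rows of X (observations x variables).

    Simplified illustration only; not the PCOut algorithm itself.
    Returns a boolean flag per row and the robust distances.
    """
    X = np.asarray(X, dtype=float)

    # Robustly centre and scale every variable (column).
    scale = mad(X, axis=0)
    scale[scale == 0] = 1.0
    Z = (X - np.median(X, axis=0, keepdims=True)) / scale

    # PCA via SVD; this works when variables outnumber observations.
    Zc = Z - Z.mean(axis=0, keepdims=True)
    _, s, Vt = np.linalg.svd(Zc, full_matrices=False)
    ratio = np.cumsum(s ** 2) / np.sum(s ** 2)
    k = int(np.searchsorted(ratio, var_explained)) + 1
    scores = Zc @ Vt[:k].T

    # Robustly rescale each component, then take Euclidean distances.
    s_scale = mad(scores, axis=0)
    s_scale[s_scale == 0] = 1.0
    scores = (scores - np.median(scores, axis=0, keepdims=True)) / s_scale
    dist = np.sqrt(np.sum(scores ** 2, axis=1))

    # Flag rows whose distance is extreme relative to the bulk.
    d_med = np.median(dist)
    d_mad = 1.4826 * np.median(np.abs(dist - d_med))
    robust_z = (dist - d_med) / d_mad
    return robust_z > cutoff, dist


def simulate(n_clean=54, n_out=6, p=200, shift=3.0, seed=0):
    """Hypothetical data: Gaussian noise rows, with the first n_out rows shifted."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n_clean + n_out, p))
    X[:n_out] += shift
    return X


if __name__ == "__main__":
    X = simulate()
    flags, dist = pca_outlier_flags(X)
    print("flagged rows:", np.flatnonzero(flags))
```

Rescaling the component scores with the MAD rather than the standard deviation is the key design choice here: with more variables than observations, classical Mahalanobis distances computed on all components carry little information, whereas a robust scale lets a minority of shifted rows stand out.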


Similar resources

Identification of outliers types in multivariate time series using genetic algorithm

Multivariate time series data are often modeled using a vector autoregressive moving average (VARMA) model. However, the presence of outliers can violate the stationarity assumption and may lead to incorrect modeling, biased parameter estimates and inaccurate predictions. Thus, detection of these points and how to deal properly with them, especially in relation to modeling and parameter estimation of VARMA m...


Missing data imputation in multivariable time series data

Multivariate time series data are found in a variety of fields such as bioinformatics, biology, genetics, astronomy, geography and finance. Many time series datasets contain missing data. Imputing missing data in multivariate time series is a challenging topic and needs to be carefully considered before learning from or predicting time series. Much research has been done on the use of diffe...


Outlier Detection in Multivariate Time Series via Projection Pursuit

This article uses Projection Pursuit methods to develop a procedure for detecting outliers in a multivariate time series. We show that testing for outliers in some projection directions could be more powerful than testing the multivariate series directly. The optimal directions for detecting outliers are found by numerical optimization of the kurtosis coefficient of the projected series. We pro...


Outlier Detection in Multivariate Time Series by Projection Pursuit

In this article we use projection pursuit methods to develop a procedure for detecting outliers in a multivariate time series. We show that testing for outliers in some projection directions can be more powerful than testing the multivariate series directly. The optimal directions for detecting outliers are found by numerical optimization of the kurtosis coefficient of the projected series. We ...


New Proposals in Multivariate Outliers Identification

Occurrences of outliers in multivariate time series are unpredictable events which may severely distort the analysis of the series. A convenient way of representing multiple outliers consists in superimposing a deterministic disturbance on a Gaussian multivariate time series. Outliers may then be modelled as non-Gaussian time series components. The independent componen...



Journal:
  • Computers in biology and medicine

Volume 53

Publication date: 2014